This paper considers a class of qubit channels for which three states are always sufficient to achieve the Holevo capacity. For these channels it is known that there are cases where two orthogonal states are sufficient, two non-orthogonal states are required, or three states are necessary. Here a systematic theory is given which provides criteria to distinguish cases where two states are sufficient, and determine whether these two states should be orthogonal or non-orthogonal. In addition, we prove a theorem on the form of the optimal ensemble when three states are required, and present efficient methods of calculating the Holevo capacity.

I. INTRODUCTION

A quantum channel is a completely positive and trace preserving (CPTP) map on quantum states. The condition that it is completely positive means that the result of the map is a positive operator, and therefore may represent the state of a system, even if the map acts on one part of an entangled system. The condition that it is trace preserving ensures that the final state is normalised. In contrast to unitary operations, quantum channels can increase the entropy of a state. A quantum channel arises if an ancilla space is added, a unitary operation is performed between the system and the ancilla, then the ancilla is traced over to obtain the reduced density operator for the system.

Quantum channels are used to model communication channels, and therefore an important quantity to consider for these channels is the amount of classical communication that may be performed. This is often quantified by the Holevo capacity. The Holevo capacity of a quantum channel $\Phi$ is given by

$$C(\Phi) = \sup_{p_i,\rho_i}\Big[S(\Phi(\rho)) - \sum_i p_i S(\Phi(\rho_i))\Big], \tag{1}$$

where $\rho = \sum_i p_i\rho_i$, and $S(\sigma) = -\mathrm{Tr}\,\sigma\log_2\sigma$ is the von Neumann entropy. The $p_i$ are probabilities, and therefore must be non-negative and sum to 1. The Holevo capacity is the asymptotic classical communication that may be achieved using joint measurements on output states, but unentangled inputs [?]. In general determining the Holevo capacity of a channel is a nontrivial task. For the class of channels considered here, it will be shown that the capacity may be determined in a straightforward way.
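As a concrete illustration of the quantity inside the supremum of Eq. (1), the following minimal sketch evaluates it for one fixed ensemble. The helper names, the Kraus-operator representation of the channel, and the amplitude damping example with parameter `mu` are assumptions made for this illustration; the sketch does not perform the maximisation over ensembles.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr rho log2 rho, computed from the eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # 0 log 0 is taken as 0
    return float(-np.sum(evals * np.log2(evals)))

def apply_channel(kraus_ops, rho):
    """Phi(rho) = sum_k K_k rho K_k^dagger for a list of Kraus operators."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

def holevo_quantity(kraus_ops, probs, states):
    """Bracketed quantity in Eq. (1) for a fixed ensemble {p_i, rho_i}."""
    outputs = [apply_channel(kraus_ops, rho) for rho in states]
    average = sum(p * out for p, out in zip(probs, outputs))
    return von_neumann_entropy(average) - sum(
        p * von_neumann_entropy(out) for p, out in zip(probs, outputs))

# Assumed example: amplitude damping channel with parameter mu = 0.5.
mu = 0.5
kraus = [np.array([[1, 0], [0, np.sqrt(mu)]]),
         np.array([[0, np.sqrt(1 - mu)], [0, 0]])]
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
rho1 = np.array([[0, 0], [0, 1]], dtype=complex)   # |1><1|
chi = holevo_quantity(kraus, [0.5, 0.5], [rho0, rho1])
```

The value `chi` is only a lower bound on the capacity, because this particular ensemble need not be optimal.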

An important issue is the number of states $\rho_i$ that must be considered in the maximisation. It is well known that, for quantum channels that act upon a Hilbert space of dimension $d$, the number of states in the ensemble need not exceed $d^2$ [3]. In particular, for a qubit channel no more than four states are required. For the very simple case of unital qubit channels, where $\Phi(\mathbb{1}) = \mathbb{1}$, the capacity is achieved for two orthogonal input states [4]. For more general qubit channels, the capacity may be achieved for two non-orthogonal inputs [5], three states [3], or four states may be required [?].

With the exception of the channels considered in Ref. [?], these results are all for a class of channels that can require at most three states. Here we give simple criteria for these channels that, when satisfied, mean that two states are sufficient. These criteria are not satisfied by the channels that require three states given in [?], but are satisfied by examples given in Refs. [?] where two states are sufficient. In addition, we give criteria to determine when the input states should be orthogonal or non-orthogonal.

This paper is organised as follows. We present the proof of the criteria in Sec. II. Then, in Sec. III we give applications of the result to results presented in previous work. We consider the form of the optimal ensembles for those cases where three states are required in Sec. IV. In Sec. V we show how our results may be applied to the calculation of the Holevo capacity. Conclusions are given in Sec. VI.

II. TWO STATE ENSEMBLES 

To obtain the results, we use the representation of the qubit channel on the Bloch sphere. A general qubit density operator may be expressed as

$$\rho = \frac{1}{2}\left[\mathbb{1} + \vec r\cdot\vec\sigma\right], \tag{2}$$

where $\vec\sigma$ is the vector of Pauli operators $(\sigma_x, \sigma_y, \sigma_z)^T$. The length of the vector $\vec r$ does not exceed 1, and its components give the position of the state in the Bloch sphere. A qubit channel $\Phi$ maps the sphere of possible input states to an ellipsoid, and may be expressed as

$$\Phi(\rho) = \frac{1}{2}\left[\mathbb{1} + (\Lambda\vec r + \vec t)\cdot\vec\sigma\right]. \tag{3}$$

That is, the channel $\Phi$ produces the mapping $\vec r \mapsto \Lambda\vec r + \vec t$. Via local unitary operations before and after the map, the transformation matrices $\Lambda$ and $\vec t$ may be brought to the form

$$\Lambda = \begin{pmatrix}\lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_3\end{pmatrix}, \qquad \vec t = \begin{pmatrix}t_x\\ t_y\\ t_z\end{pmatrix}. \tag{4}$$

That is, an arbitrary qubit channel $\Phi$ may be expressed as $\Phi = \Gamma_U\circ\Phi_{t,\Lambda}\circ\Gamma_V$, where $\Gamma_U$ and $\Gamma_V$ are unitary channels, and $\Phi_{t,\Lambda}$ is the channel with $\Lambda$ and $\vec t$ given by (4). For this study, we consider the restricted case of channels $\Phi$ such that the $x$ and $y$ components of $\vec t$ are zero, and use the notation $t = t_z$. Hence $\vec t$ is given by

$$\vec t = \begin{pmatrix}0\\ 0\\ t\end{pmatrix}. \tag{5}$$
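The affine action on Bloch vectors in Eqs. (3)-(5) can be checked against an operator-level description of a channel. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the amplitude damping channel in Kraus form (parameter `mu`) is chosen only because its Bloch-sphere parameters $\lambda_1 = \lambda_2 = \sqrt{\mu}$, $\lambda_3 = \mu$, $t = 1 - \mu$ are quoted later in the text.

```python
import numpy as np

PAULIS = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

def bloch_to_density(r):
    """Eq. (2): rho = (1 + r . sigma) / 2."""
    return 0.5 * (np.eye(2) + sum(ri * s for ri, s in zip(r, PAULIS)))

def density_to_bloch(rho):
    """Inverse of Eq. (2): r_k = Tr(rho sigma_k)."""
    return np.array([np.trace(rho @ s).real for s in PAULIS])

def channel_on_bloch(lams, t, r):
    """Eq. (3) with diagonal Lambda and t on the z axis: r -> Lambda r + (0, 0, t)."""
    return np.asarray(lams) * np.asarray(r) + np.array([0.0, 0.0, t])

# Assumed example: amplitude damping in Kraus form with parameter mu.
mu = 0.3
kraus = [np.array([[1, 0], [0, np.sqrt(mu)]]),
         np.array([[0, np.sqrt(1 - mu)], [0, 0]])]
r_in = np.array([0.3, -0.4, 0.5])
rho_out = sum(K @ bloch_to_density(r_in) @ K.conj().T for K in kraus)
r_kraus = density_to_bloch(rho_out)
r_affine = channel_on_bloch([np.sqrt(mu), np.sqrt(mu), mu], 1 - mu, r_in)
```

Both routes give the same output Bloch vector, which is the content of the mapping $\vec r \mapsto \Lambda\vec r + \vec t$ for this example.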



In order to evaluate the Holevo capacity, we use an approach similar to that of Ref. [?]. The Holevo capacity may be given by the following expression [?]:

$$C(\Phi) = \min_{\psi_0}\max_{\rho_0} D(\Phi(\rho_0)\|\Phi(\psi_0)), \tag{6}$$

where $D$ is the relative entropy

$$D(\rho\|\psi) = \mathrm{Tr}(\rho\log\rho - \rho\log\psi). \tag{7}$$

Throughout this paper we use the convention that "log" and "exp" are base 2, and logarithms base $e$ are given as "ln". The relative entropy can be evaluated using the following useful result from [?]:

$$D(\rho\|\psi) = \frac{1}{2}\left[f(r) - \log(1 - q^2) - r\cos(\theta) f'(q)\right], \tag{8}$$

where

$$f(x) = (1 + x)\log(1 + x) + (1 - x)\log(1 - x), \tag{9}$$

$$f'(x) = \log\left(\frac{1 + x}{1 - x}\right). \tag{10}$$

The Bloch vectors for $\rho$ and $\psi$ are $\vec r$ and $\vec q$, respectively, and we also define $r = |\vec r|$, $q = |\vec q|$, and $\cos(\theta) = \vec r\cdot\vec q/rq$.
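The closed form in Eq. (8) can be compared numerically with the definition in Eq. (7). The following is a small sketch with hypothetical helper names; it assumes both states are strictly inside the Bloch sphere so that the matrix logarithms are finite.

```python
import numpy as np

SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def density(bloch):
    """Eq. (2): density operator for a given Bloch vector."""
    return 0.5 * (np.eye(2) + sum(c * s for c, s in zip(bloch, SIGMA)))

def f(x):
    """Eq. (9), base-2 logs, with 0 log 0 taken as 0."""
    return sum(y * np.log2(y) for y in (1.0 + x, 1.0 - x) if y > 0)

def f_prime(x):
    """Eq. (10)."""
    return np.log2((1.0 + x) / (1.0 - x))

def rel_entropy_bloch(r_vec, q_vec):
    """Eq. (8), evaluated from the two Bloch vectors (requires |q| < 1)."""
    r_vec, q_vec = np.asarray(r_vec, float), np.asarray(q_vec, float)
    r, q = np.linalg.norm(r_vec), np.linalg.norm(q_vec)
    if q == 0:
        return 0.5 * f(r)                       # the last two terms of Eq. (8) vanish
    r_cos_theta = np.dot(r_vec, q_vec) / q      # r cos(theta)
    return 0.5 * (f(r) - np.log2(1 - q**2) - r_cos_theta * f_prime(q))

def rel_entropy_matrix(rho, psi):
    """Eq. (7), evaluated from eigen-decompositions (full-rank states only)."""
    def log2m(m):
        w, v = np.linalg.eigh(m)
        return (v * np.log2(w)) @ v.conj().T
    return float(np.trace(rho @ (log2m(rho) - log2m(psi))).real)

r_vec, q_vec = [0.3, 0.2, -0.4], [0.1, -0.5, 0.2]
d_closed = rel_entropy_bloch(r_vec, q_vec)
d_matrix = rel_entropy_matrix(density(r_vec), density(q_vec))
```

The two values agree to numerical precision, and the closed form avoids any matrix diagonalisation.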

To evaluate the Holevo capacity, we consider the action of the simplified channel $\Phi_{t,\Lambda}$. This channel has the same capacity as $\Phi$ because unitary operations do not affect the capacity. The set of possible output states from the channel $\Phi_{t,\Lambda}$ forms an ellipsoid centred on the $z$ axis. The ellipsoid has a radius of $|\lambda_1|$ in the $x$ direction, and a radius of $|\lambda_2|$ in the $y$ direction.

The nature of the optimal ensemble may be determined by considering the states in the minmax formula (6). In the following we take the states $\rho = \Phi_{t,\Lambda}(\rho_0)$ and $\psi = \Phi_{t,\Lambda}(\psi_0)$ to be output states from the simplified channel. If $\psi$ is the average output density operator for an optimal ensemble, the operators $\rho_k$ that maximise $D(\rho_k\|\psi)$ are possible output states for this ensemble. It is necessary that there is some set of probabilities $p_k$ such that $\sum_k p_k\rho_k = \psi$. The optimal ensemble is not necessarily unique, because there may be different ways of choosing the probabilities such that $\sum_k p_k\rho_k = \psi$. However, from Ref. [?] the optimal average output state is unique.

As we are restricting to operations such that $\vec t$ lies on the $z$ axis, there are many simplifications due to the symmetry of the system. Many of these simplifications were used in Ref. [?] in the analysis of the amplitude damping channel. We give a general explanation here.

Firstly, the optimal state $\psi$ must lie on the $z$ axis. To show this result, for any pair of states $\rho$ and $\psi$, consider the second pair $\rho'$ and $\psi'$, where $\vec r' = (-r_x, -r_y, r_z)^T$ and $\vec q' = (-q_x, -q_y, q_z)^T$. Due to symmetry, if $\rho$ and $\psi$ are possible output states, then so are $\rho'$ and $\psi'$. From the symmetry of the relative entropy, it is evident that $D(\rho\|\psi) = D(\rho'\|\psi')$. This immediately implies that $\max_\rho D(\rho\|\psi) = \max_\rho D(\rho\|\psi')$. Therefore, if $\psi$ minimises this quantity, then so does $\psi'$. However, as the optimal average output state is unique, $\psi$ and $\psi'$ must coincide, which implies that $\psi$ lies on the $z$ axis.

In the case that $|\lambda_1| \ne |\lambda_2|$, the $\rho_k$ that maximise the relative entropy will lie in the $x$-$z$ plane if $|\lambda_1| > |\lambda_2|$, and the $y$-$z$ plane if $|\lambda_1| < |\lambda_2|$. That is because $\psi$ lies on the $z$ axis, so the relative entropy is symmetric under rotation about the $z$ axis. If $|\lambda_1| > |\lambda_2|$, then the ellipsoid has a radius in the $x$ direction larger than the radius in the $y$ direction. Consider any state $\rho$ that is not in the $x$-$z$ plane. We can determine a second state $\rho'$ in the $x$-$z$ plane with Bloch vector $\vec r' = (\sqrt{r_x^2 + r_y^2}, 0, r_z)^T$. This state is in the interior of the ellipsoid, and we may obtain a third state on the surface of the ellipsoid, $\rho''$, by extending outwards in a straight line from $\psi$. From Ref. [8] (the first lemma in Sec. 5.3),

$$D(\rho''\|\psi) > D(\rho'\|\psi) = D(\rho\|\psi). \tag{11}$$

This implies that $\rho$ does not maximise the relative entropy. Hence, all $\rho_k$ that maximise the relative entropy must be in the $x$-$z$ plane. Similarly, if $|\lambda_1| < |\lambda_2|$, the ellipsoid has a radius in the $y$ direction larger than the radius in the $x$ direction, and the optimal $\rho_k$ must be in the $y$-$z$ plane.

In the case that $|\lambda_1| = |\lambda_2|$, the situation is a little more complicated. For each optimal $\rho_k$, there is a circle of optimal density operators around the $z$ axis. However, in order to obtain an optimal ensemble, it is only necessary to use non-zero probabilities such that $\sum_k p_k\rho_k = \psi$. As $\psi$ lies on the $z$ axis, it is sufficient to take $\rho_k$ from a single plane in the Bloch sphere that contains the $z$ axis.

This reasoning means that, regardless of the relative values of $|\lambda_1|$ and $|\lambda_2|$, we may restrict to considering $\rho_k$ that maximise $D(\rho_k\|\psi)$ in a single plane in the Bloch sphere. Caratheodory's theorem implies that there need be no more than three states in the ensemble. This fact was also noted in Ref. [?]. The examples given by Ref. [?] which needed four states used $\vec t$ that were not on the $z$ axis.

In fact, in some cases the number of states required is only two [5], though in some cases three are required. Here we give criteria that can show when only two states are required via the following theorem:

Theorem 1. For a CPTP map $\Phi = \Gamma_U\circ\Phi_{t,\Lambda}\circ\Gamma_V$ with $\Lambda$ given by (4) and $\vec t$ given by (5), if $\lambda_m = |\lambda_3|$ or $A \notin (0, 1/2)$, where

$$A = \frac{t^2\lambda_3^2}{\lambda_m^2 - \lambda_3^2} - 1 + \lambda_m^2 + t^2, \tag{12}$$

and $\lambda_m = \max(|\lambda_1|, |\lambda_2|)$, then there is an ensemble that gives the maximum output Holevo information and has two states.
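The criterion of Theorem 1 is simple to evaluate once the channel is in the form of Eqs. (4) and (5). The sketch below is one possible encoding; the function names and the numerical tolerance `tol` (used so that rounding error around $A = 0$ or $A = 1/2$ is not misread as lying inside the open interval) are assumptions of this illustration. A `False` result means only that the theorem does not apply, not that three states are necessarily required.

```python
import math

def lambda_m(lam1, lam2):
    """lambda_m = max(|lambda_1|, |lambda_2|)."""
    return max(abs(lam1), abs(lam2))

def A_parameter(lam1, lam2, lam3, t):
    """Eq. (12); returns math.inf when lambda_m = |lambda_3|."""
    lm2 = lambda_m(lam1, lam2) ** 2
    denom = lm2 - lam3 ** 2
    if abs(denom) < 1e-12:
        return math.inf
    return (t * lam3) ** 2 / denom - 1.0 + lm2 + t ** 2

def two_states_suffice(lam1, lam2, lam3, t, tol=1e-12):
    """Sufficient condition of Theorem 1: lambda_m = |lambda_3| or A outside (0, 1/2)."""
    A = A_parameter(lam1, lam2, lam3, t)
    return math.isinf(A) or A <= tol or A >= 0.5 - tol

# Hypothetical parameter sets, chosen only to exercise both outcomes.
inside_interval = two_states_suffice(0.6, 0.6, 0.5, 0.5)   # A is about 0.178
equal_radii = two_states_suffice(0.5, 0.5, 0.5, 0.5)       # lambda_m = |lambda_3|
```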

Before we proceed to the proof, we give some explanation of the quantity $A$. Let us consider the output ellipse in the $x$-$z$ plane if $|\lambda_1| > |\lambda_2|$, or the $y$-$z$ plane if $|\lambda_1| < |\lambda_2|$. A point on the surface of this ellipse has a distance from the origin $r$, which is given by Eq. (16) in the proof below. Taking the derivative of $r^2$ with respect to $\phi$ gives

$$\frac{d}{d\phi}(r^2) = 2\sin\phi\left[(\lambda_m^2 - \lambda_3^2)\cos\phi - \lambda_3 t\right]. \tag{13}$$

This expression is zero if $\sin\phi = 0$, $\lambda_m^2 - \lambda_3^2 = \lambda_3 t = 0$, or

$$\cos\phi = \frac{\lambda_3 t}{\lambda_m^2 - \lambda_3^2}. \tag{14}$$

The third case is only possible if the absolute value of the right-hand side (RHS) does not exceed 1. If it does not, then substituting this expression for $\cos\phi$ into the expression for $r$ gives the extremum

$$r_e^2 = \frac{t^2\lambda_3^2}{\lambda_m^2 - \lambda_3^2} + \lambda_m^2 + t^2 = A + 1. \tag{15}$$

Therefore, in this case, $A$ is the difference between the square of an extremum of $r$ and 1. In the case $\lambda_m^2 - \lambda_3^2 = \lambda_3 t = 0$, the radius is independent of $\phi$. This possibility will be excluded in the discussion of $A$, because $\lambda_m = |\lambda_3|$ is an alternative criterion to $A \notin (0, 1/2)$, and leads to infinite $A$.

If $A$ were positive, then $r_e^2$ would be larger than one, which is not possible for CPTP maps. Therefore, for any map such that an extremum of $r$ is obtained for $\sin\phi \ne 0$ (and $\lambda_m \ne |\lambda_3|$), the condition $A \notin (0, 1/2)$ is automatically satisfied due to the fact that states can not be mapped outside the Bloch sphere. However, $A \notin (0, 1/2)$ is not satisfied for every possible CPTP map, because for some maps $|\lambda_3 t/(\lambda_m^2 - \lambda_3^2)| > 1$.

Another case where $A \notin (0, 1/2)$ is automatically satisfied is when $\lambda_m < |\lambda_3|$. That is because the condition that the map is CPTP implies that $\lambda_m^2 + t^2 \le 1$, and if $\lambda_m < |\lambda_3|$ then $t^2\lambda_3^2/(\lambda_m^2 - \lambda_3^2)$ is negative. Therefore, from the definition of $A$, it is clear that $A < 0$. We now proceed to the proof of the theorem.

Proof. We begin the analysis by mentioning some trivial cases that would otherwise complicate the analysis. If $t = 0$, then the channel is unital, and the result in this case was proven in Ref. [4]. If all three of the $\lambda_k$ are zero, then the channel capacity is zero, and the result is trivial. If two of the $\lambda_k$ are zero, then the possible output states form a line in the Bloch sphere, and the result follows from the fact that there are only two extremal output states.

The result is also trivial if $\lambda_3 = 0$. In that case, since we may restrict to considering states in the $x$-$z$ or $y$-$z$ plane, the set of output states that it is sufficient to consider forms a line. The result again follows from the fact that there are only two extremal states. For the remainder of the analysis we take $t \ne 0$, $\lambda_3 \ne 0$, and assume that no more than one of the $\lambda_k$ is zero. This third assumption means that $\lambda_m \ne 0$.

For the remainder of this proof we consider the input and output states for the simplified channel $\Phi_{t,\Lambda}$. The input and output states for the total channel $\Phi$ will simply be rotated from these states. We take the input state to have $\vec r = (\sin\phi, 0, \cos\phi)^T$ for $|\lambda_1| > |\lambda_2|$, or $\vec r = (0, \sin\phi, \cos\phi)^T$ for $|\lambda_1| < |\lambda_2|$. The output state will then have $\vec r = (\lambda_1\sin\phi, 0, t + \lambda_3\cos\phi)^T$ or $(0, \lambda_2\sin\phi, t + \lambda_3\cos\phi)^T$. The state $\psi$ has $\vec q = (0, 0, q_z)^T$. In either case, we have for the output

$$r = \sqrt{\lambda_m^2\sin^2\phi + (t + \lambda_3\cos\phi)^2}, \qquad r\cos\theta = (t + \lambda_3\cos\phi)\times\mathrm{sign}(q_z). \tag{16}$$
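As an aside, the step from Eq. (14) to Eq. (15) can be written out explicitly from the radius in Eq. (16) and the definition of $A$ in Eq. (12). The abbreviations $c = \cos\phi$ and $\Delta = \lambda_m^2 - \lambda_3^2$ are introduced only for this sketch, which assumes $\Delta \ne 0$.

```latex
\begin{align*}
r^2 &= \lambda_m^2 (1 - c^2) + (t + \lambda_3 c)^2
     = \lambda_m^2 + t^2 - \Delta c^2 + 2 t \lambda_3 c, \\
\frac{d(r^2)}{d\phi} &= -\sin\phi \,\frac{d(r^2)}{dc}
     = 2 \sin\phi \left[ \Delta c - \lambda_3 t \right], \\
r_e^2 &= \lambda_m^2 + t^2 - \frac{\lambda_3^2 t^2}{\Delta} + \frac{2 \lambda_3^2 t^2}{\Delta}
     = \frac{t^2 \lambda_3^2}{\lambda_m^2 - \lambda_3^2} + \lambda_m^2 + t^2 = A + 1
     \quad \text{at } c = \frac{\lambda_3 t}{\Delta}.
\end{align*}
```

The second line reproduces Eq. (13), and the third is the value of $r^2$ at the stationary point with $\sin\phi \ne 0$, when such a point exists.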

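To search for the optimal $\rho$, it is merely necessary to search for the optimal $\phi$. Because $\mathrm{sign}(q_z) f'(q) = f'(q_z)$, we may write the relative entropy as

$$D(\rho\|\psi) = \frac{1}{2}\left[f(r) - \log(1 - q_z^2) - (t + \lambda_3\cos\phi) f'(q_z)\right]. \tag{17}$$

The derivative of $D(\rho\|\psi)$ with respect to $\phi$ is

$$\frac{d}{d\phi}D(\rho\|\psi) = \frac{1}{2}\left\{[(\lambda_m^2 - \lambda_3^2)\cos\phi - t\lambda_3] f'(r)/r + f'(q_z)\lambda_3\right\}\sin\phi. \tag{18}$$

There will be extrema of $D(\rho\|\psi)$ for $\phi = 0$ and $\phi = \pi$, as well as when

$$[(\lambda_m^2 - \lambda_3^2)\cos\phi - t\lambda_3] f'(r)/r = -f'(q_z)\lambda_3. \tag{19}$$

We will consider the solutions of this equation for $\phi$ in the interval $(0, \pi)$. Any solution in $(0, \pi)$ will yield a corresponding solution in $(-\pi, 0)$ due to symmetry. Taking the derivative of the left-hand side (LHS) gives

$$\frac{d}{d\phi}[(\lambda_m^2 - \lambda_3^2)\cos\phi - t\lambda_3] f'(r)/r = \left\{-(\lambda_m^2 - \lambda_3^2)\frac{f'(r)}{r} + [(\lambda_m^2 - \lambda_3^2)\cos\phi - t\lambda_3]^2\,\frac{1}{r}\frac{d}{dr}\left(\frac{f'(r)}{r}\right)\right\}\sin\phi. \tag{20}$$

In the case that $|\lambda_m| \ne |\lambda_3|$,

$$[(\lambda_m^2 - \lambda_3^2)\cos\phi - t\lambda_3]^2 = (\lambda_m^2 - \lambda_3^2)(1 - r^2 + A). \tag{21}$$

We then obtain

$$\frac{d}{d\phi}[(\lambda_m^2 - \lambda_3^2)\cos\phi - t\lambda_3] f'(r)/r = \frac{(\lambda_m^2 - \lambda_3^2)\sin\phi}{r}\,[h(r) + A g(r)], \tag{22}$$

where

$$g(r) = \frac{d}{dr}\left(\frac{f'(r)}{r}\right) = \frac{2}{(1 - r^2)\,r\ln 2} - \frac{1}{r^2}\log\left(\frac{1 + r}{1 - r}\right), \tag{23}$$

$$h(r) = \frac{2}{r\ln 2} - \frac{1}{r^2}\log\left(\frac{1 + r}{1 - r}\right). \tag{24}$$

The functions $g(r)$ and $h(r)$ satisfy the inequalities

$$g(r) > 0, \qquad h(r) < 0, \qquad 2h(r) + g(r) > 0, \tag{25}$$

for $r \in (0, 1)$. If $A \le 0$, then $h(r) + A g(r)$ is negative for $r \in (0, 1)$. Similarly, if $A \ge 1/2$, then $h(r) + A g(r)$ is positive for $r \in (0, 1)$. In either case $h(r) + A g(r)$ has constant sign. We do not need to consider the possibility that $r = 0$, because this value is only possible when $\sin\phi = 0$ (for $\lambda_m \ne 0$).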
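The inequalities in Eq. (25) can also be checked from power series. The following sketch assumes $g(r) = \frac{d}{dr}\left(f'(r)/r\right)$ and $h(r) = (1 - r^2) g(r) - f'(r)$, and uses the standard expansion $\ln\frac{1+r}{1-r} = 2\sum_{j\ge 0} r^{2j+1}/(2j+1)$, valid for $|r| < 1$. It is offered as a supporting aside and is not claimed to be the argument used by the author.

```latex
\begin{align*}
(\ln 2)\, g(r) &= \frac{2}{r}\sum_{j\ge 0} r^{2j} - \frac{2}{r}\sum_{j\ge 0}\frac{r^{2j}}{2j+1}
               = \sum_{j\ge 1}\frac{4j}{2j+1}\, r^{2j-1} > 0, \\
(\ln 2)\, h(r) &= \frac{2}{r} - \frac{2}{r}\sum_{j\ge 0}\frac{r^{2j}}{2j+1}
               = -\sum_{j\ge 1}\frac{2}{2j+1}\, r^{2j-1} < 0, \\
(\ln 2)\,[2h(r) + g(r)] &= \sum_{j\ge 2}\frac{4(j-1)}{2j+1}\, r^{2j-1} > 0 .
\end{align*}
```

Every coefficient has a definite sign for $r \in (0, 1)$, so $h + Ag < 0$ when $A \le 0$, and $h + Ag = (h + g/2) + (A - 1/2)\,g > 0$ when $A \ge 1/2$.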

The case where $r = 1$ is more complicated. It is possible for $r$ to be equal to 1 for $\phi \in (0, \pi)$. In the case where $r$ has a maximum for $\phi \in (0, \pi)$, the maximum value of $r^2$ is $A + 1$. If $r$ is equal to 1 for $\phi \in (0, \pi)$, this must be a maximum, and therefore $A = 0$ (as we are taking $\lambda_m \ne |\lambda_3|$). That implies that the expression in square brackets on the LHS of Eq. (19) is proportional to $\sqrt{1 - r^2}$. Hence the LHS of (19) approaches zero as $r$ approaches 1, and is continuous as a function of $\phi$ for $\phi \in (0, \pi)$. As $h(r) + A g(r)$ has constant sign for all values of $\phi \in (0, \pi)$ except where $r = 1$, and the LHS of (19) is continuous where $r = 1$, the LHS of (19) is one-to-one in this interval.

For the case $\lambda_m = |\lambda_3|$,

$$\frac{d}{d\phi}[(\lambda_m^2 - \lambda_3^2)\cos\phi - t\lambda_3] f'(r)/r = t^2\lambda_3^2(\sin\phi)\,g(r)/r. \tag{26}$$

Therefore, the derivative of the LHS of (19) is nonzero for $\phi \in (0, \pi)$. Note that we are assuming that $t \ne 0$ and $\lambda_3 \ne 0$, so the RHS of Eq. (26) is nonzero. Thus we have shown that, regardless of the relative values of $\lambda_m$ and $\lambda_3$, the LHS of (19) is a one-to-one function of $\phi$, and there can be at most one solution of (19) in $(0, \pi)$. If there is a solution, it must correspond to an extremum, because a point of inflection would conflict with the fact that the LHS of (19) is one-to-one.

As $D(\rho\|\psi)$ is symmetric about $\phi = 0$, there must be two solutions of (19) with $\sin\phi \ne 0$ or none. In the case where there are no solutions, there are only two extrema (for $\phi = 0$ and $\pi$), and only one of these can be a maximum. This is not consistent with $\psi$ being optimal, because the optimal ensemble can not have only one state. Therefore, if $\psi$ is optimal, then there must be two solutions of (19). As the maxima and minima alternate, the maxima are either at $\phi = 0$ and $\pi$, or the solutions of (19).

In the case that $|\lambda_1| \ne |\lambda_2|$, this result immediately implies that there are only two states in the optimal ensemble. In the case $|\lambda_1| = |\lambda_2|$, if the maxima correspond to the solutions of (19), optimal ensembles may contain any states in a ring about the $z$ axis. However, as discussed above, it is only necessary to consider $\rho_k$ in one plane in the Bloch sphere in this case, so there is again an optimal ensemble with two members. □

It is also possible to determine simple criteria for when the optimal states in the ensemble are on the $z$ axis, and when the optimal states in the ensemble correspond to the maxima for $\sin\phi \ne 0$. The result is:

Theorem 2. Let $\Phi_{t,\Lambda}$ be a CPTP map with $\Lambda \ne 0$ given by (4) and $\vec t$ given by (5). The condition that $\lambda_m = |\lambda_3|$ or $A \notin (0, 1/2)$ may be expressed as two alternative mutually exclusive conditions:
Condition 1. $\lambda_m \le |\lambda_3|$ or $A \ge 1/2$
Condition 2. $\lambda_m > |\lambda_3|$ and $A \le 0$

If Condition 1 is satisfied, the optimal ensemble consists of two states on the $z$ axis. If Condition 2 is satisfied, there is an optimal ensemble consisting of two states equidistant from the $z$ axis and lying on a line perpendicular to and intersecting the $z$ axis.

Here we have given the result in terms of the simplified map $\Phi_{t,\Lambda}$, rather than expressing it in terms of the arbitrary map $\Phi$. That is because the ellipse of output states will be rotated for the arbitrary map, so it is not possible to express the result in this way. The statement of this theorem also differs in that $\Lambda$ is taken to be non-zero. This is to exclude the trivial case where all ensembles give zero Holevo information.

Proof. As was shown above, $\lambda_m < |\lambda_3|$ implies that $A < 0$. Another consequence of this is that, if $A \ge 0$, then $\lambda_m \ge |\lambda_3|$. Therefore Condition 1 contains three alternatives:

1. $\lambda_m = |\lambda_3|$

2. $\lambda_m < |\lambda_3|$ and $A < 0$

3. $A \ge 1/2$ and $\lambda_m > |\lambda_3|$

It is clear that, for each of these three alternatives, the conditions of Theorem 1 must hold. If none of these alternatives apply, but $A \notin (0, 1/2)$, then $\lambda_m > |\lambda_3|$ and $A \le 0$, which is Condition 2 given in the theorem.

To determine which extrema of $D(\rho\|\psi)$ are maxima and which are minima, it is sufficient to consider the point $\phi = 0$. At this point, the second derivative of $D(\rho\|\psi)$ is given by

$$\frac{d^2}{d\phi^2}D(\rho\|\psi) = \frac{1}{2}\left\{[(\lambda_m^2 - \lambda_3^2) - t\lambda_3] f'(r)/r + f'(q_z)\lambda_3\right\}. \tag{27}$$

We know that the LHS of (19) is one-to-one, and there must be at least one solution of (19) if $\psi$ is optimal (otherwise there would be only one possible state for the ensemble).

If $\lambda_m = |\lambda_3|$, then from (26) the LHS of (19) is monotonically increasing for $\phi \in (0, \pi)$. If $A \ge 1/2$ and $\lambda_m > |\lambda_3|$, then $h(r) + A g(r) > 0$, and from (22) the LHS of (19) is monotonically increasing. Similarly, if $\lambda_m < |\lambda_3|$ and $A < 0$, then $h(r) + A g(r) < 0$, and the LHS of (19) is again monotonically increasing. Therefore, for all three alternatives for Condition 1, the LHS of (19) is monotonically increasing for $\phi \in (0, \pi)$. For Condition 2, $\lambda_m > |\lambda_3|$ and $A \le 0$, so $h(r) + A g(r) < 0$, and the LHS of (19) is monotonically decreasing for $\phi \in (0, \pi)$.

If the LHS of (19) is monotonically increasing for $\phi \in (0, \pi)$, the LHS of (19) must be less than the RHS for $\phi = 0$, so

$$[(\lambda_m^2 - \lambda_3^2) - t\lambda_3] f'(r)/r + f'(q_z)\lambda_3 < 0. \tag{28}$$

This means that the second derivative of $D(\rho\|\psi)$ is negative for $\phi = 0$, and $D(\rho\|\psi)$ is a maximum at this point. Hence, the two maxima are obtained for $\phi = 0$ and $\pi$, and these values correspond to the states in the optimal ensemble. Thus we see that, for Condition 1, the LHS of (19) is monotonically increasing and the optimal ensemble consists of two states on the $z$ axis.

Alternatively, for Condition 2, the LHS of (19) is monotonically decreasing, so the LHS of (19) is greater than the RHS for $\phi = 0$, and less for $\phi = \pi$. This implies that the second derivative of $D(\rho\|\psi)$ is positive for $\phi = 0$ and $\phi = \pi$, and these points are minima. Hence, in this case the states in the optimal ensemble correspond to the extrema of $D(\rho\|\psi)$ for $\sin\phi \ne 0$.

In the case that $|\lambda_1| > |\lambda_2|$ or $|\lambda_1| < |\lambda_2|$, the optimal ensemble must be in the $x$-$z$ plane or $y$-$z$ plane, respectively. In either case, two maxima are obtained in the appropriate plane for $\phi = \pm\phi_0$, where $\phi_0$ maximises $D(\rho\|\psi)$. These two solutions are equidistant from the $z$ axis, and on a line perpendicular to and intersecting the $z$ axis. If $|\lambda_1| = |\lambda_2|$, then there will be a circle of states about the $z$ axis that maximise the relative entropy. Optimal ensembles may contain any number of these states. However, as discussed above we may restrict to states in one plane. This yields an ensemble with two members that again lie on a line perpendicular to and intersecting the $z$ axis. □
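The statements of Theorem 2 can be probed numerically with a brute-force version of the minmax formula, Eq. (6), using the symmetry reductions above: $\psi$ is restricted to the $z$ axis and $\rho$ to a single plane, so that Eq. (17) applies. The sketch below is a crude grid search under those assumptions, with hypothetical function names and grid sizes; it is meant as a sanity check, not as the efficient method of calculating capacities referred to elsewhere in the paper.

```python
import numpy as np

def xlog2x(y):
    """y * log2(y) with the convention 0 log 0 = 0 (tiny negatives clipped)."""
    y = np.clip(y, 0.0, None)
    return np.where(y > 0, y * np.log2(np.where(y > 0, y, 1.0)), 0.0)

def f(x):
    return xlog2x(1.0 + x) + xlog2x(1.0 - x)           # Eq. (9)

def f_prime(x):
    return np.log2((1.0 + x) / (1.0 - x))              # Eq. (10)

def rel_entropy(phi, qz, lam_m, lam3, t):
    """Eq. (17): output state at angle phi, psi on the z axis at height qz."""
    rz = t + lam3 * np.cos(phi)
    r = np.minimum(np.sqrt((lam_m * np.sin(phi))**2 + rz**2), 1.0)  # Eq. (16)
    return 0.5 * (f(r) - np.log2(1.0 - qz**2) - rz * f_prime(qz))

def minmax_capacity(lam_m, lam3, t, n_phi=1001, n_q=1001):
    """Grid version of Eq. (6): minimise over q_z the maximum over phi."""
    phis = np.linspace(0.0, np.pi, n_phi)
    lo, hi = t - abs(lam3), t + abs(lam3)
    qzs = np.linspace(lo, hi, n_q + 2)[1:-1]           # interior points, |q_z| < 1
    D = rel_entropy(phis[None, :], qzs[:, None], lam_m, lam3, t)
    worst = D.max(axis=1)                              # max over phi for each q_z
    k = worst.argmin()                                 # min over q_z
    return worst[k], qzs[k], phis[D[k].argmax()]

# Amplitude damping parameters as quoted in Sec. III, with mu = 0.5 (Condition 2).
mu = 0.5
C, qz_opt, phi_opt = minmax_capacity(np.sqrt(mu), mu, 1.0 - mu)
```

For this example the maximising angle `phi_opt` lies strictly between 0 and $\pi$, as expected under Condition 2, whereas a channel satisfying Condition 1 (for instance $\lambda_m = \lambda_3 = t = 1/2$) has its maximum at an endpoint of the grid.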

Another issue is the position of the optimal average output state. It is possible to use similar techniques as above to show that this state should be further from the centre of the Bloch sphere than the output for the maximally mixed state. Specifically, $q_z$ for the optimal average output state should satisfy $q_z/t > 1$ for $t$ and $\lambda_3$ both nonzero. The case $t = 0$ means that the map is unital, and it is known in that case that $q_z = 0$ is optimal. If $\lambda_3 = 0$, then clearly $q_z = t$.

To show this result, let us assume some value for $q_z$ (the other components of $\vec q$ are zero), and take a value of $\phi$ such that $|t + \lambda_3\cos\phi| > |t - \lambda_3\cos\phi|$. We denote the states with $r_z = t \pm \lambda_3\cos\phi$ by $\rho_\pm$. Determining the difference in relative entropies gives

$$D(\rho_+\|\psi) - D(\rho_-\|\psi) = \frac{1}{2}\left[f(r_+) - f(r_-) - 2\lambda_3\cos\phi\,f'(q_z)\right] > \frac{1}{2}\left[f'(\bar r)(r_+ - r_-) - 2\lambda_3\cos\phi\,f'(q_z)\right], \tag{29}$$

where $r_\pm$ is the magnitude of the Bloch vector for $\rho_\pm$, and $\bar r = (r_+ + r_-)/2$. In the second line we have used the strict convexity of $f'(r)$ and the Hermite-Hadamard inequality [?]. Now using the fact that $r_+^2 - r_-^2 = 4t\lambda_3\cos\phi$, we have $r_+ - r_- = (2t\lambda_3\cos\phi)/\bar r$. Therefore Eq. (29) simplifies to

$$D(\rho_+\|\psi) - D(\rho_-\|\psi) > t\lambda_3\cos\phi\left[f'(\bar r)/\bar r - f'(q_z)/t\right]. \tag{30}$$

We have chosen $\phi$ such that $t\lambda_3\cos\phi$ is positive, and both $f'(x)$ and $f'(x)/x$ are monotonically increasing functions. Also $\bar r \ge t$, with equality only if $\lambda_m\sin\phi = 0$. Therefore, $q_z/t < 1$ implies that

$$D(\rho_+\|\psi) - D(\rho_-\|\psi) > 0. \tag{31}$$

This means that, if $t$ is positive and $q_z < t$, then all states $\rho_-$ that have $z$ component of their Bloch vector less than $t$ do not maximise the relative entropy. In addition, if $q_z = t$ the relative entropy can not be maximised for $r_z = t$. In the case $\lambda_m = 0$ this is trivial, because the maxima are for $r_z = t + \lambda_3$ and $r_z = t - \lambda_3$. If $\lambda_m \ne 0$, then $f'(r)/r > f'(t)/t$. As we are also taking $\lambda_3 \ne 0$, this inequality means that Eq. (19) can not be satisfied for $\phi = \pi/2$.

Hence, for $q_z \le t$, $t > 0$ and $\lambda_3 \ne 0$, all $\rho_k$ that maximise the relative entropy must have a $z$ component of their Bloch vector greater than that for $\psi$, and they can not give an average equal to $\psi$. This is not consistent with $\psi$ being the average state for the optimal ensemble, and therefore the average state for the optimal ensemble must satisfy $q_z > t$. Similarly, if $t$ is negative and $\lambda_3 \ne 0$, then the average state for the optimal ensemble satisfies $q_z < t$.

With the aid of this result, we can alternatively express Theorem 2 in terms of the orthogonality of the input states. The result is:

Corollary 1. Consider a CPTP map $\Phi = \Gamma_U\circ\Phi_{t,\Lambda}\circ\Gamma_V$ with $\Lambda \ne 0$ given by (4) and $\vec t$ given by (5). The condition that $\lambda_m = |\lambda_3|$ or $A \notin (0, 1/2)$ may be expressed as two alternative mutually exclusive conditions:
Condition 1. $\lambda_m \le |\lambda_3|$ or $A \ge 1/2$
Condition 2. $\lambda_m > |\lambda_3|$ and $A \le 0$

If $t \ne 0$ and $\lambda_3 \ne 0$, the maximum output Holevo information is obtained for two orthogonal input states if Condition 1 is satisfied, and two non-orthogonal input states if Condition 2 is satisfied.

Proof. Note first that unitary operations do not change the orthogonality relations between the states. Therefore it is sufficient to prove the orthogonality relations for the simplified map $\Phi_{t,\Lambda}$. For Condition 1 the result follows immediately from Theorem 2. The two input states are the extremal states on the $z$ axis, and therefore are $|0\rangle$ and $|1\rangle$, which are orthogonal.

To prove the result for Condition 2, we use the result that, for $t \ne 0$ and $\lambda_3 \ne 0$, $q_z$ is not equal to $t$. If the input states for Condition 2 were orthogonal, then their Bloch vectors would be opposite and, since they are symmetric about the $z$ axis, would have zero $z$ component, which would lead to $q_z = t$. Therefore, if $t \ne 0$ and $\lambda_3 \ne 0$, the input states must be non-orthogonal if Condition 2 holds. □

III. APPLICATIONS

These results allow us to make sense of the results obtained in previous work. In particular, Ref. [?] found that only two states in the ensemble were required for the amplitude damping channel, where $\lambda_1 = \lambda_2 = \sqrt{\mu}$, $\lambda_3 = \mu$ and $t = 1 - \mu$. We find that, in this case, $A = 0$, so $A \notin (0, 1/2)$ is satisfied and Theorem 1 predicts that the optimal ensemble requires two states. For this channel, $\lambda_m > |\lambda_3|$ and $A \le 0$, which corresponds to Condition 2 in Theorem 2. Theorem 2 therefore predicts that, for this channel, the optimal ensemble consists of two states at the same distance from the $x$-$y$ plane, rather than on the $z$ axis. This is what was found in Ref. [?].

Another channel is the shifted depolarising channel, which was considered in Ref. [?]. For this channel, $\lambda_k = \mu$ and $t = 1 - \mu$. As $\lambda_m = \lambda_3$, Theorem 1 applies, and the ensemble should require only two states. This result is what was found in [?]. Also, because $\lambda_m = \lambda_3$, Condition 1 in Theorem 2 holds, so Theorem 2 predicts that the states in the optimal ensemble lie on the $z$ axis. This is also consistent with the results of Ref. [?].

On the other hand, let us consider the examples given in [?] that require three states. For one of these examples, $\lambda_1 = \lambda_2 = 0.6$ and $\lambda_3 = t = 0.5$, so $A \approx 0.178$. This is in the interval $(0, 1/2)$, so it is not surprising that three states are required. Another example is $\lambda_1 = t = 0.5$ and $\lambda_2 = \lambda_3 = 0.435$; in this case $A$ is about 0.278, which is again in the interval $(0, 1/2)$.

In Ref. [?] a strategy used to find channels that require three states was to vary the parameters from a channel such that the optimal states are on the $z$ axis to one where the optimal states are away from the $z$ axis. This strategy can alternatively be explained in terms of Theorem 2. The channel parameters can not be continuously varied from Condition 1 to Condition 2 without $A$ passing through the interval $(0, 1/2)$. That is because it is not possible to continuously vary the channel parameters from $\lambda_m \le |\lambda_3|$ to $\lambda_m > |\lambda_3|$ while maintaining the same sign for $A$.

To take an example from [?], let $\lambda_3 = t = 1/2$, and vary $\lambda_m$. Then the variation of $A$ and $\lambda_m^2 - \lambda_3^2$ are as in Fig. 1. It can be seen from this figure that as $\lambda_m^2 - \lambda_3^2$ passes through zero, $A$ switches from negative to positive. In fact the only point where Condition 2 is satisfied is for $\lambda_m = 1/\sqrt{2}$. In passing from $\lambda_m = 0.5$, where $\lambda_m = \lambda_3$, to $\lambda_m = 1/\sqrt{2}$, the value of $A$ passes through $(0, 1/2)$.
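The values of $A$ quoted in this section, and the sign change shown in Fig. 1, follow directly from Eq. (12). The sketch below recomputes them; the function names, the tolerance `tol`, and the choice $\mu = 0.5$ for the amplitude damping and shifted depolarising examples are assumptions of this illustration.

```python
import math

def A_parameter(lam_m, lam3, t):
    """Eq. (12) with lam_m = max(|lambda_1|, |lambda_2|); inf if lam_m = |lambda_3|."""
    denom = lam_m**2 - lam3**2
    if abs(denom) < 1e-12:
        return math.inf
    return (t * lam3)**2 / denom - 1.0 + lam_m**2 + t**2

def classify(lam_m, lam3, t, tol=1e-9):
    """Report which condition of Theorem 2 holds, or that A lies in (0, 1/2)."""
    A = A_parameter(lam_m, lam3, t)
    if lam_m <= abs(lam3) + tol or A >= 0.5 - tol:
        return "Condition 1"
    if A <= tol:
        return "Condition 2"
    return "A in (0, 1/2)"

# (lam_m, lam3, t) for the channels discussed in the text.
examples = {
    "three-state example 1": (0.6, 0.5, 0.5),
    "three-state example 2": (0.5, 0.435, 0.5),
    "amplitude damping, mu = 0.5": (math.sqrt(0.5), 0.5, 0.5),
    "shifted depolarising, mu = 0.5": (0.5, 0.5, 0.5),
}
report = {name: (A_parameter(*p), classify(*p)) for name, p in examples.items()}

# Sweep corresponding to Fig. 1: lambda_3 = t = 1/2, lambda_m from 0.30 to 0.70.
sweep = [(lm, A_parameter(lm, 0.5, 0.5))
         for lm in [0.3 + 0.01 * k for k in range(41)]]
```

In the sweep, $A$ is negative for $\lambda_m$ below 1/2 and positive above it, falling towards zero as $\lambda_m$ approaches $1/\sqrt{2}$.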

A case of particular interest is that where $\Lambda$ and $\vec t$ are given by

$$\Lambda = \begin{pmatrix}\cos\delta & 0 & 0\\ 0 & \cos\gamma & 0\\ 0 & 0 & \cos\gamma\cos\delta\end{pmatrix}, \qquad \vec t = \begin{pmatrix}0\\ 0\\ \sin\gamma\sin\delta\end{pmatrix}. \tag{32}$$

This type of channel arises naturally when considering qubit interactions. If one introduces an ancilla qubit, performs a unitary operation, then traces over this ancilla qubit, the resulting operation is of this form [?]. Maps of this form also arise naturally when considering extremal maps [13]. Also, it is known that all qubit maps with two Kraus operators are of this form [?].

FIG. 1: The values of $A$ (solid line) and $\lambda_m^2 - \lambda_3^2$ (dashed line) as a function of $\lambda_m$ for $\lambda_3 = t = 1/2$. The shaded region shows the region of values of $A$ such that the optimal ensemble may require three states. Results for $\lambda_m > 1/\sqrt{2}$ are not shown, because the maps for $\lambda_m > 1/\sqrt{2}$ are not CPTP.

For maps of this form, we find that $A = 0$, so the conditions of Theorem 1 are satisfied. Therefore, for maps that arise from a unitary interaction with an ancilla qubit, the optimal ensemble requires only two states. This result was also claimed in Ref. [?], although the complete proof was not given. In addition, $|\lambda_3| < \lambda_m$, so from Theorem 2 the two states for the optimal ensemble are away from the $z$ axis.
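The statement $A = 0$ for maps of the form (32) can be verified by direct substitution into Eq. (12). The short derivation below assumes $|\cos\delta| \ge |\cos\gamma|$, so that $\lambda_m = |\cos\delta|$, together with $\cos\delta \ne 0$ and $\sin\gamma \ne 0$ so that the denominator is nonzero; the opposite ordering follows by exchanging $\gamma$ and $\delta$.

```latex
\begin{align*}
\lambda_m^2 - \lambda_3^2 &= \cos^2\delta\,(1 - \cos^2\gamma) = \cos^2\delta\,\sin^2\gamma, \\
\frac{t^2\lambda_3^2}{\lambda_m^2 - \lambda_3^2}
  &= \frac{\sin^2\gamma\,\sin^2\delta\,\cos^2\gamma\,\cos^2\delta}{\cos^2\delta\,\sin^2\gamma}
   = \sin^2\delta\,\cos^2\gamma, \\
-1 + \lambda_m^2 + t^2 &= -\sin^2\delta + \sin^2\gamma\,\sin^2\delta = -\sin^2\delta\,\cos^2\gamma, \\
A &= \sin^2\delta\,\cos^2\gamma - \sin^2\delta\,\cos^2\gamma = 0 .
\end{align*}
```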



IV. THREE STATE ENSEMBLES 

In the case where three states are required for the optimal ensemble, it is possible to show that one of the states needs to be on the $z$ axis. The result is

Theorem 3. Consider a CPTP map $\Phi_{t,\Lambda}$ with $\Lambda$ given by (4) and $\vec t$ given by (5). If the Holevo capacity can not be achieved with a two-state ensemble, then any optimal ensemble with three states consists of one state on the $z$ axis, and two states equidistant from the $z$ axis and on a line perpendicular to and intersecting the $z$ axis. The optimal input state on the $z$ axis is $|0\rangle$ if $|t + \lambda_3| > |t - \lambda_3|$, and $|1\rangle$ if $|t + \lambda_3| < |t - \lambda_3|$.

Proof. In order to prove the result, we start by considering the expression in square brackets in (22). Although $h(r) + A g(r)$ can change sign, it is only zero for one value of $r$. To show this result, we use the following facts:

$$h(r) < 0, \qquad g(r) > 0, \qquad g'(r) > 0, \tag{33}$$

$$h'(r) g(r) - g'(r) h(r) > 0. \tag{34}$$

These inequalities are all for $r \in (0, 1)$, and are easily checked by plotting the functions. If $h(r) + A g(r) \ge 0$ for $r = r_0$, then $A \ge -h(r_0)/g(r_0)$, so $h'(r_0) + A g'(r_0) \ge (h'(r_0) g(r_0) - g'(r_0) h(r_0))/g(r_0) > 0$. Therefore, if $h(r) + A g(r) \ge 0$ for $r = r_0$, then $h(r) + A g(r)$ is increasing for $r = r_0$. This implies that, if there is a value of $r$ for which $h(r) + A g(r) = 0$, then $h(r) + A g(r) > 0$ for all larger values of $r$. Hence $h(r) + A g(r)$ can be zero for only one value of $r$ in $(0, 1)$.
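The claim that $h(r) + A g(r)$ vanishes at most once can be explored numerically by counting sign changes on a grid. The sketch below assumes the forms $g(r) = \frac{d}{dr}(f'(r)/r)$ and $h(r) = (1 - r^2) g(r) - f'(r)$ for the functions in Eq. (22); the grid limits and function names are choices made for this illustration, and a grid check is of course not a proof.

```python
import numpy as np

LN2 = np.log(2.0)

def f_prime(r):
    """Eq. (10): log2((1 + r) / (1 - r)), via log1p for accuracy."""
    return (np.log1p(r) - np.log1p(-r)) / LN2

def g(r):
    """Assumed form of g: d/dr [f'(r) / r]."""
    return 2.0 / ((1.0 - r**2) * r * LN2) - f_prime(r) / r**2

def h(r):
    """Assumed form of h: (1 - r^2) g(r) - f'(r)."""
    return 2.0 / (r * LN2) - f_prime(r) / r**2

def sign_changes(A, n=20001):
    """Count sign changes of h(r) + A g(r) on a grid inside (0, 1)."""
    r = np.linspace(1e-2, 1.0 - 1e-6, n)
    s = np.sign(h(r) + A * g(r))
    s = s[s != 0]                       # ignore exact zeros on the grid
    return int(np.count_nonzero(s[1:] != s[:-1]))

# A values outside, on the boundary of, and inside the interval (0, 1/2).
counts = {A: sign_changes(A) for A in (-0.2, 0.0, 0.178, 0.278, 0.5, 0.8)}
```

For the values tried here, a single sign change appears only when $A$ lies in $(0, 1/2)$, and none otherwise, in line with the constant-sign property used in the proof of Theorem 1.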

Recall that, if there is an extremum of r for sin0 ^ 0, 
then the condition A ^ (0, 1/2) is satisfied, and therefore 
the optimal ensemble requires no more than two states. 
In the conditions for Theorem O the optimal ensemble 
requires more than two states, so r has no extremum 
for sin0 7^ 0. Hence r is a one-to-one function for cj) in 
the interval (0,7r). Combining this result with the above 
reasoning, the RHS of (|22() can be zero for only one value 
of (f> in the interval (0, tt). 

These results imply that the LHS of H19I) can have 
a turning point for only one value of (f) in (0,7r), and 
therefore there are at most two solutions of H19|l for e 
(0, tt). In turn this implies that there are no more than 
two extrema of D{p\\ip) for (j> g (0, tt). In fact, there 
must be exactly two (if ip is optimal), because if there 
were only one, then the optimal ensemble would require 
only two states, which violates the conditions of Theorem 

m 

Thus there will be two extrema of D{p\\tp) for cf) e 
(0, tt), two symmetric extrema for (j) e (— tt, 0), and ex- 
trema at = and tt. These extrema must alternate 
between minima and maxima, and so one of the extrema 
at = and tt will be a maximum, and the other will 
be a minimum. To determine which points are minima 
and which are maxima, consider the second derivative of 
D{p\\ip) at a solution of l(T^ : 



■[h{r) + Agir)]. (35) 



2r 

Recall that, if $h(r) + A g(r)$ is positive for $r = r_0$, it must
also be positive for $r > r_0$. Therefore, for the solution of
(19) with smaller $r$, $h(r) + A g(r)$ is negative, and for the
solution with larger $r$, $h(r) + A g(r)$ is positive.

For maps that require three states to achieve the
Holevo capacity, $A > 0$. As discussed above, this implies
that $\lambda_m > |\lambda_3|$, so $\lambda_m^2 - \lambda_3^2$ is positive. Thus
multiplication by $\lambda_m^2 - \lambda_3^2$ does not change the sign, so the
solution of (19) with smaller $r$ is a maximum, and the solution
with larger $r$ is a minimum. As the extrema alternate
between maxima and minima, the extremum on the $z$ axis
that is closer to the origin must be a minimum. Therefore,
if $|t + \lambda_3|$ is greater than $|t - \lambda_3|$, then the optimal
output state on the $z$ axis will be at $t + \lambda_3$. This
corresponds to an input state of $|0\rangle$. Similarly, if $|t - \lambda_3|$ is
greater than $|t + \lambda_3|$, then the optimal output state on
the $z$ axis is at $t - \lambda_3$, which corresponds to the input
state $|1\rangle$.

The two remaining states in the optimal ensemble will
correspond to solutions $\phi = \pm\phi_0$ of (19). In the case that
$|\lambda_1| \neq |\lambda_2|$, these states are in the x-z or y-z plane
of the Bloch sphere, depending on whether $|\lambda_1| > |\lambda_2|$ or
$|\lambda_1| < |\lambda_2|$. In either case the states are equidistant from
the $z$ axis, on a line that is perpendicular to and intersects
the $z$ axis. If $|\lambda_1| = |\lambda_2|$, then optimal ensembles
may contain any states from a circle about the $z$ axis.
However, for optimal ensembles with three states, the
condition that the mean state is on the $z$ axis restricts
the remaining two states to be equidistant from the $z$
axis, and on a line that is perpendicular to and intersects the
$z$ axis. □



V. CALCULATING CAPACITIES 

These results enable us to determine numerically efficient
ways of calculating capacities. In the case that the
channel satisfies the conditions of Theorem 2, the problem
becomes particularly simple. First it is necessary to
check whether it is Condition 1 or Condition 2 in Theorem 2
that is satisfied. For Condition 1, the optimal
ensemble consists of the two extremal states on the $z$
axis. The probabilities may be determined by the fact
that $D(\rho_1\|\psi) = D(\rho_2\|\psi)$. The expression for the
relative entropy given earlier simplifies to

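$$D(\rho\|\psi) = \frac{1}{2}\left[f(r_z) - \log(1 - q_z^2) - r_z f'(q_z)\right]. \tag{36}$$

The condition that $D(\rho_1\|\psi) = D(\rho_2\|\psi)$ then becomes

$$f(t + \lambda_3) - (t + \lambda_3) f'(q_z) = f(t - \lambda_3) - (t - \lambda_3) f'(q_z). \tag{37}$$

This may be solved for $q_z$, yielding

$$q_z = \frac{X - 1}{X + 1}, \tag{38}$$

where

$$X = \exp\left[\frac{f(t + \lambda_3) - f(t - \lambda_3)}{2\lambda_3}\right]. \tag{39}$$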



Recall that we are using notation where "exp" means 2
to the power of the argument. The channel capacity is
obtained by substituting this value of $q_z$ into (36). Thus the
channel capacity may be obtained analytically. The optimal
ensemble may also be determined analytically. The optimal
states correspond to points on the $z$ axis at $t \pm \lambda_3$,
and the probabilities are given by

$$p_\pm = \frac{1}{2} \pm \frac{q_z - t}{2\lambda_3}. \tag{40}$$
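
As an illustration, the analytic two-state calculation of Eqs. (36) to (40) can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, and the definition $f(x) = (1+x)\log_2(1+x) + (1-x)\log_2(1-x)$ is assumed here because it is consistent with Eqs. (36) and (41); the definition given earlier in the paper should be used if it differs. The sketch also assumes $\lambda_3 \neq 0$. The helper `holevo_quantity` evaluates the Holevo information of the same ensemble directly from the entropies, as a cross-check on the common relative entropy.

```python
import math


def f(x):
    """Assumed f(x) = (1+x)log2(1+x) + (1-x)log2(1-x), taking 0*log(0) = 0."""
    return sum(y * math.log2(y) for y in (1.0 + x, 1.0 - x) if y > 0.0)


def z_axis_two_state(t, lam3):
    """Evaluate Eqs. (36)-(40) for output states on the z axis at t +/- lam3."""
    log_x = (f(t + lam3) - f(t - lam3)) / (2.0 * lam3)  # log2 of X, Eq. (39)
    x = 2.0 ** log_x                                     # "exp" means 2**(...)
    q_z = (x - 1.0) / (x + 1.0)                          # Eq. (38)
    p_plus = 0.5 + (q_z - t) / (2.0 * lam3)              # Eq. (40)
    p_minus = 1.0 - p_plus
    # Eq. (36) evaluated at r_z = t + lam3, using f'(q_z) = log2(X)
    d = 0.5 * (f(t + lam3) - math.log2(1.0 - q_z ** 2) - (t + lam3) * log_x)
    return q_z, p_plus, p_minus, d


def holevo_quantity(t, lam3, p_plus):
    """S(mean state) - sum_k p_k S(rho_k), with S = 1 - f(r)/2 for a qubit."""
    p_minus = 1.0 - p_plus
    q_z = p_plus * (t + lam3) + p_minus * (t - lam3)
    return 0.5 * (p_plus * f(t + lam3) + p_minus * f(t - lam3) - f(q_z))


# Illustrative parameters only (not taken from the paper).
q_z, p_plus, p_minus, d = z_axis_two_state(0.2, 0.5)
print(q_z, p_plus, p_minus, d, holevo_quantity(0.2, 0.5, p_plus))
```

The two printed capacity values should agree to numerical precision, because the average of the relative entropies over the ensemble equals its Holevo information when $\psi$ is the mean state.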



For Condition 2 in Theorem 2 the optimal states are
away from the $z$ axis. Because $\psi$ must be the average
of the two $\rho_k$, and the $z$ components of the two $\vec{r}_k$ are
equal, the $z$ component of $\vec{q}$ must take the same value. If $\psi$ is
optimal, for the solution of (19) the $z$ component of $\vec{r}$
should be equal to the $z$ component of $\vec{q}$. Therefore the
optimal ensemble may be found by finding the solution
of (19) with $q_z = r_z$. Thus finding the capacity in this
case reduces to finding the zero of a function of a single
real variable, which is easily performed numerically.

As an alternative interpretation of this result, consider
the ensemble consisting of two states corresponding to
$\phi = \pm\phi_0$. The Holevo information of this ensemble is
given by

$$D(\rho_\pm\|\psi) = \frac{1}{2}\left[f(r) - f(r_z)\right], \tag{41}$$

where $\psi$ is the average state. If the optimal ensemble is of
this form, then the maximum of this quantity gives the
Holevo capacity for the channel. Taking the derivative
with respect to $\phi$, we find that the maximum will be for
a solution of (19) with $q_z = r_z$.
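
The maximisation of Eq. (41) over $\phi$ can be sketched as follows. This is a minimal illustration rather than the procedure used for the results in the paper: it assumes $f(x) = (1+x)\log_2(1+x) + (1-x)\log_2(1-x)$, takes $r_z = t + \lambda_3\cos\phi$ and $r^2 = \lambda_m^2\sin^2\phi + r_z^2$ (the parametrisation used for $r_0$ in the three-state calculation), and uses a coarse grid followed by a golden-section refinement. The function names are hypothetical.

```python
import math


def f(x):
    """Assumed f(x) = (1+x)log2(1+x) + (1-x)log2(1-x), taking 0*log(0) = 0."""
    return sum(y * math.log2(y) for y in (1.0 + x, 1.0 - x) if y > 0.0)


def pair_holevo(phi, lam_m, lam3, t):
    """Eq. (41): Holevo information of the pair of states at +/- phi."""
    r_z = t + lam3 * math.cos(phi)
    r = math.hypot(lam_m * math.sin(phi), r_z)
    return 0.5 * (f(r) - f(r_z))


def golden_max(func, a, b, tol=1e-10):
    """Golden-section search for a maximum of func on the bracket [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc > fd:            # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:                  # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return 0.5 * (a + b)


def best_symmetric_pair(lam_m, lam3, t, n=2000):
    """Return (phi0, value) maximising Eq. (41) over phi in (0, pi)."""
    def objective(phi):
        return pair_holevo(phi, lam_m, lam3, t)

    step = math.pi / n
    k_best = max(range(1, n), key=lambda k: objective(k * step))
    phi0 = golden_max(objective, (k_best - 1) * step, (k_best + 1) * step)
    return phi0, objective(phi0)


# Illustrative unital example (t = 0), chosen only for demonstration.
print(best_symmetric_pair(0.8, 0.5, 0.0))
```

The coarse grid guards against refining onto a local maximum; the resolution `n` is an arbitrary choice and may need to be increased for channels where the objective varies rapidly.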

For the case where the $\lambda_k$ and $t$ are as given in (32), the
problem of calculating the capacity has been considered
in Ref. [3]. For this case, this reference gives an analytic
method for calculating the Holevo capacity for a given
mean state. Although this method was derived in quite a
different way from the method given here, it is equivalent.

In those cases where $A \in (0, 1/2)$, it is still possible
that two states may be sufficient for the optimal ensemble.
In those cases, the ensemble must still consist of
either two states on the $z$ axis of the Bloch sphere, or
two states corresponding to $\phi = \pm\phi_0$, where $\phi_0$ is a root
of (19). This result may be shown by considering $D(\rho\|\psi)$
as a function of $\phi$. As was shown in the previous section,
there can be at most three maxima of $D(\rho\|\psi)$. If there
are only two, then these are at $\phi = 0$ and $\pi$ or $\phi = \pm\phi_0$.
In either case, the form of the optimal ensemble is the
same as for channels satisfying the conditions of Theorem 2.

If there are three maxima, then one of these is on the
$z$ axis, and the other two are for $\phi = \pm\phi_0$. If two states
are sufficient for the optimal ensemble, these states must
correspond to $\phi = \pm\phi_0$, because otherwise $\psi$ would not
be on the $z$ axis. Therefore, regardless of whether there
are two maxima or three, if two states are sufficient for
the optimal ensemble, then these consist of either two
states on the $z$ axis, or two states corresponding to
$\phi = \pm\phi_0$.

These results can be used to determine if the optimal
ensemble requires three states in cases where
$A \in (0, 1/2)$. From the "sufficiency of maximal distance
property" in [10], we know that the ensemble is optimal if
there are no values of $\rho$ that give values of $D(\rho\|\psi)$ greater
than those for the $\rho_k$ in the ensemble. Therefore, in order to
determine if the ensemble requires more than two states,
determine $\psi$ via the two different methods above. If, for
one of them, $D(\rho\|\psi)$ is maximised for the corresponding
$\rho_k$, then the optimal ensemble requires only two states.
If neither of these methods gives the optimal ensemble,
then we have eliminated all possibilities for optimal
two-state ensembles, and the optimal ensemble must require
three states.
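
The test described above can be sketched numerically as follows. Several assumptions are made and should be checked against the earlier sections: $f(x) = (1+x)\log_2(1+x) + (1-x)\log_2(1-x)$ with $f'(x) = \log_2[(1+x)/(1-x)]$; the relative entropy to a state $\psi$ on the $z$ axis is taken in the form $\frac{1}{2}[f(r) - \log(1 - q_z^2) - r_z f'(q_z)]$, as in Eqs. (36) and (42); and the maximum of $D(\rho\|\psi)$ is searched only over a grid in $\phi$ within the plane containing the $z$ axis and the $\lambda_m$ axis. The function names, grid size and tolerance are hypothetical choices.

```python
import math


def f(x):
    """Assumed f(x) = (1+x)log2(1+x) + (1-x)log2(1-x), taking 0*log(0) = 0."""
    return sum(y * math.log2(y) for y in (1.0 + x, 1.0 - x) if y > 0.0)


def f_prime(x):
    """Derivative of the assumed f: log2[(1+x)/(1-x)]."""
    return math.log2((1.0 + x) / (1.0 - x))


def rel_entropy(phi, q_z, lam_m, lam3, t):
    """D(rho(phi)||psi) for a state psi on the z axis at height q_z."""
    r_z = t + lam3 * math.cos(phi)
    r = math.hypot(lam_m * math.sin(phi), r_z)
    return 0.5 * (f(r) - math.log2(1.0 - q_z ** 2) - r_z * f_prime(q_z))


def is_optimal(q_z, d_ens, lam_m, lam3, t, n, tol):
    """Maximal-distance check on a grid: no output state exceeds d_ens."""
    return all(rel_entropy(math.pi * k / n, q_z, lam_m, lam3, t) <= d_ens + tol
               for k in range(n + 1))


def requires_three_states(lam_m, lam3, t, n=4000, tol=1e-6):
    """True if neither two-state candidate passes the maximal-distance check."""
    # Candidate 1: two states on the z axis, with q_z from Eqs. (38)-(39).
    log_x = (f(t + lam3) - f(t - lam3)) / (2.0 * lam3)
    x = 2.0 ** log_x
    q1 = (x - 1.0) / (x + 1.0)
    d1 = rel_entropy(0.0, q1, lam_m, lam3, t)
    # Candidate 2: the pair at +/- phi0 maximising Eq. (41), with q_z = r_z.
    def pair(phi):
        r_z = t + lam3 * math.cos(phi)
        return 0.5 * (f(math.hypot(lam_m * math.sin(phi), r_z)) - f(r_z))
    phi0 = max((math.pi * k / n for k in range(1, n)), key=pair)
    q2 = t + lam3 * math.cos(phi0)
    d2 = rel_entropy(phi0, q2, lam_m, lam3, t)
    return not (is_optimal(q1, d1, lam_m, lam3, t, n, tol)
                or is_optimal(q2, d2, lam_m, lam3, t, n, tol))


# Parameters of the cross in Fig. 2, and an illustrative unital channel.
print(requires_three_states(0.6, 0.5, 0.5))
print(requires_three_states(0.8, 0.5, 0.0))
```

Because the check is made on a finite grid with a finite tolerance, borderline cases can be misclassified; a finer grid or a local refinement would be needed there.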

It is also possible to efficiently determine the Holevo
capacity in those cases where the ensemble requires three
states. The reason for this is that the only unknowns for
the three-state ensemble are the value of $\phi_0$ such that
$\phi = \pm\phi_0$ for the two off-axis states, and the probabilities
for the three states. Given the value of $\phi_0$, there is an
analytic method to determine the probabilities. Therefore
the problem reduces to a numerical maximisation in
a single real variable, which is easily performed.

From Theorem 3 the state on the $z$ axis will be at
$t + \lambda_3$ if $|t + \lambda_3| > |t - \lambda_3|$, and $t - \lambda_3$ if
$|t + \lambda_3| < |t - \lambda_3|$. Taking the other two states to
correspond to $\phi = \pm\phi_0$, the condition that the relative
entropy $D(\rho_k\|\psi)$ is independent of $k$ becomes

$$f(t \pm \lambda_3) - (t \pm \lambda_3) f'(q_z) = f(r_0) - (t + \lambda_3\cos\phi_0) f'(q_z), \tag{42}$$

where $r_0^2 = \lambda_m^2\sin^2\phi_0 + (t + \lambda_3\cos\phi_0)^2$. We take the
plus sign if $|t + \lambda_3| > |t - \lambda_3|$, and the minus sign if
$|t + \lambda_3| < |t - \lambda_3|$. Solving for $q_z$ gives

$$q_z = \frac{X - 1}{X + 1}, \tag{43}$$

where

$$X = \exp\left[\frac{f(t \pm \lambda_3) - f(r_0)}{\lambda_3(\pm 1 - \cos\phi_0)}\right]. \tag{44}$$

Note that this solution is reasonable only if the value of
$q_z$ obtained is between $t \pm \lambda_3$ and $t + \lambda_3\cos\phi_0$; otherwise
negative probabilities would be required for the ensemble.

Given this solution for $q_z$, the common value of the
relative entropy is given by

$$D(\rho_k\|\psi) = \frac{1}{2}\left[f(t \pm \lambda_3) - \log(1 - q_z^2) - (t \pm \lambda_3)\log X\right]. \tag{45}$$

By finding the maximum of this (with $q_z$ between $t \pm \lambda_3$
and $t + \lambda_3\cos\phi_0$), the Holevo capacity may be determined.
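
A minimal sketch of this one-dimensional maximisation is given below. It is not the code used for the results in the paper: it assumes $f(x) = (1+x)\log_2(1+x) + (1-x)\log_2(1-x)$, scans $\phi_0$ on a uniform grid rather than using a dedicated optimiser, and the function name and grid size are hypothetical. The identity $(X-1)/(X+1) = \tanh(\tfrac{1}{2}\ln X)$ is used to evaluate Eq. (43) without overflow.

```python
import math


def f(x):
    """Assumed f(x) = (1+x)log2(1+x) + (1-x)log2(1-x), taking 0*log(0) = 0."""
    return sum(y * math.log2(y) for y in (1.0 + x, 1.0 - x) if y > 0.0)


def three_state_capacity(lam_m, lam3, t, n=20000):
    """Maximise Eq. (45) over phi0; returns (capacity, phi0, q_z) or None."""
    # Sign choice stated after Eq. (42): on-axis state farther from the origin.
    sign = 1.0 if abs(t + lam3) > abs(t - lam3) else -1.0
    z_end = t + sign * lam3
    best = None
    for k in range(1, n):
        phi0 = math.pi * k / n
        r_z = t + lam3 * math.cos(phi0)
        r0 = math.hypot(lam_m * math.sin(phi0), r_z)
        denom = lam3 * (sign - math.cos(phi0))
        if abs(denom) < 1e-12:
            continue                                  # degenerate ensemble
        log_x = (f(z_end) - f(r0)) / denom            # log2 of X, Eq. (44)
        q_z = math.tanh(0.5 * math.log(2.0) * log_x)  # Eq. (43)
        # q_z must lie between the on-axis state and the pair's z component,
        # otherwise negative probabilities would be required.
        if abs(q_z) >= 1.0 or not min(z_end, r_z) <= q_z <= max(z_end, r_z):
            continue
        d = 0.5 * (f(z_end) - math.log2(1.0 - q_z ** 2) - z_end * log_x)  # Eq. (45)
        if best is None or d > best[0]:
            best = (d, phi0, q_z)
    return best


# Parameters of the cross in Fig. 2: lambda_1 = lambda_2 = 0.6, lambda_3 = t = 0.5.
print(three_state_capacity(0.6, 0.5, 0.5))
```

The function returns `None` when no value of $\phi_0$ on the grid gives an admissible $q_z$. The grid scan could be replaced by any standard one-dimensional optimiser once a bracket around the maximum is known.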

This method was used to determine the difference between
the two-state capacity and the three-state capacity
for a range of different maps. This difference is plotted
as a function of $A$ in Fig. 2. In addition, the states that
maximise this difference were searched for numerically
for given values of $A$; these results are also shown in Fig. 2.
It can be seen that the maximum difference in the
capacities is still quite small: less than 0.004. Also, the
difference can be nonzero in the entire interval $(0, 1/2)$.
The difference approaches zero quite rapidly as $A$
approaches 1/2, but is still nonzero. For comparison, two
of the examples from Ref. [3] are shown in Fig. 2. It was
also found that, regardless of the value of $A$, there were
cases where two states were sufficient for the optimal
ensemble.



FIG. 2: The difference between the two-state capacity and the
three-state capacity versus the value of $A$. Random samples
are shown as grey points, and the numerically obtained upper
bound is shown as the solid line. The cross and plus are
examples from Ref. [3]. The cross is for $\lambda_1 = \lambda_2 = 0.6$ and
$\lambda_3 = t = 0.5$, and the plus is for $\lambda_1 = t = 0.5$ and
$\lambda_2 = \lambda_3 = 0.435$.

VI. CONCLUSIONS

We have shown a number of results on the form of optimal
ensembles for qubit channels. The class of channels
considered includes those that can be simplified, via unitary
operations before and after the channel, to a form
that is symmetric under reflections in the x-z and y-z
planes. This class includes extremal channels, and most
examples of channels considered in previously published
work. For these channels we have introduced the parameter
$A$, which can be interpreted in some cases in terms
of the distance between the output ellipsoid and the unit
sphere.

The main result is that if $A$ is not in the interval
$(0, 1/2)$, then two states are sufficient for the ensemble
that maximises the Holevo capacity. In addition, optimal
two-state ensembles must consist of either two states on
the $z$ axis of the Bloch sphere, or two states on a line
that is perpendicular to and intersects the $z$ axis. For
cases where $A \notin (0, 1/2)$, we have presented a simple
method to determine which form the optimal ensemble
takes. This result also enables us to determine if the input
states should be orthogonal or non-orthogonal. Even
in cases where $A \in (0, 1/2)$, if two states are sufficient for
the optimal ensemble, then the ensemble must take one
of these two forms.

For cases where three states are necessary for the optimal
ensemble, our results show that the optimal three-state
ensemble consists of one state on the $z$ axis at the
maximum distance from the origin, and two states on a
line that is perpendicular to and intersects the $z$ axis. This
demonstrates that the form of the optimal three-state
ensembles found in Ref. [3] is universal.

Lastly, we have provided a computationally efficient
method of determining the Holevo capacity. For cases
where the optimal ensemble consists of two states on the
$z$ axis, the capacity may be determined analytically. For
other cases the calculation is a numerical maximisation
of a function of a single real variable, which is easily
performed. For the specific case of extremal channels,
this method is equivalent to that given in Ref. [3].